Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization

Authors

  • Zheng Wen
  • Benjamin Van Roy


Related resources

Efficient Exploration and Value Function Generalization in Deterministic Systems

We consider the problem of reinforcement learning over episodes of a finite-horizon deterministic system and as a solution propose optimistic constraint propagation (OCP), an algorithm designed to synthesize efficient exploration and value function generalization. We establish that when the true value function Q* lies within the hypothesis class Q, OCP selects optimal actions over all but at mos...


Batch Reinforcement Learning for Spoken Dialogue Systems with Sparse Value Function Approximation

In this paper, we propose to combine sample-efficient generalization frameworks for RL with a feature selection algorithm for the learning of an optimal spoken dialogue system (SDS) strategy.


Deciding to Specialize and Respecialize a Value Function for Relational Reinforcement Learning

We investigate the matter of feature selection in the context of relational reinforcement learning. We had previously hypothesized that it is more efficient to specialize a value function quickly, making specializations that are potentially suboptimal as a result, and to later modify that value function in the event that the agent gets it “wrong.” Here we introduce agents with the ability to ad...


On Determinism Handling While Learning Reduced State Space Representations

When applying a reinforcement learning technique to problems with continuous or very large state spaces, some kind of generalization is required. In the literature, two main approaches can be found. On the one hand, the generalization problem can be defined as an approximation problem of the continuous value function, typically solved with neural networks. On the other hand, other approaches disc...


Generalization and Exploration via Randomized Value Functions

We propose randomized least-squares value iteration (RLSVI), a new reinforcement learning algorithm designed to explore and generalize efficiently via linearly parameterized value functions. We explain why versions of least-squares value iteration that use Boltzmann or ε-greedy exploration can be highly inefficient, and we present computational results that demonstrate dramatic efficiency gains...



Journal title:
  • Math. Oper. Res.

Volume 42  Issue 

Pages  -

Publication date 2017